
    Subspace Adaptation Prior for Few-Shot Learning

    Gradient-based meta-learning techniques aim to distill useful prior knowledge from a set of training tasks such that new tasks can be learned more efficiently with gradient descent. While these methods have achieved success in various scenarios, they commonly adapt all parameters of trainable layers when learning new tasks. This neglects potentially more efficient learning strategies for a given task distribution and may be susceptible to overfitting, especially in few-shot learning, where tasks must be learned from a limited number of examples. To address these issues, we propose Subspace Adaptation Prior (SAP), a novel gradient-based meta-learning algorithm that jointly learns good initialization parameters (prior knowledge) and layer-wise parameter subspaces, in the form of operation subsets, that should be adaptable. In this way, SAP can learn which operation subsets to adjust with gradient descent based on the underlying task distribution, simultaneously decreasing the risk of overfitting when learning new tasks. We demonstrate that this ability is helpful, as SAP yields superior or competitive performance in few-shot image classification settings (gains between 0.1% and 3.9% in accuracy). Analysis of the learned subspaces shows that low-dimensional operations often yield high activation strengths, indicating that they may be important for achieving good few-shot learning performance. For reproducibility purposes, we publish all our research code publicly. Comment: Accepted at Machine Learning Journal, Special Issue of the ECML PKDD 2023 Journal Track.
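
    The abstract above does not spell out implementation details, so the following is only a rough, hedged sketch of the core idea as described there: a MAML-style inner loop in which a learned per-layer mask restricts which parameters are adapted with gradient descent. All names, sizes, and data are illustrative placeholders, not the authors' code.

# Hedged sketch: adapt only a masked subset of parameters with a few gradient
# steps on a toy regression task. The mask stands in for SAP's learned
# "adaptable operation subset"; it is a placeholder, not the real method.
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grad(w, X, y):
    # Mean squared error of a linear model and its gradient w.r.t. w.
    err = X @ w - y
    return 0.5 * np.mean(err ** 2), X.T @ err / len(y)

def adapt(w_init, mask, X, y, lr=0.1, steps=5):
    # Inner-loop adaptation: only parameters where mask == 1 receive updates.
    w = w_init.copy()
    for _ in range(steps):
        _, g = loss_and_grad(w, X, y)
        w -= lr * mask * g  # subspace-restricted gradient step
    return w

# Toy few-shot regression task (hypothetical data).
d = 8
w_true = rng.normal(size=d)
X = rng.normal(size=(10, d))                # 10 support examples
y = X @ w_true + 0.1 * rng.normal(size=10)

w_init = np.zeros(d)                        # stands in for the meta-learned initialization
mask = (rng.random(d) < 0.5).astype(float)  # stands in for the learned adaptable subset
w_adapted = adapt(w_init, mask, X, y)
print("support loss after adaptation:", loss_and_grad(w_adapted, X, y)[0])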

    Understanding Transfer Learning and Gradient-Based Meta-Learning Techniques

    Deep neural networks can yield good performance on various tasks but often require large amounts of data to train them. Meta-learning has received considerable attention as one approach to improve the generalization of these networks from a limited amount of data. Whilst meta-learning techniques have been observed to be successful at this in various scenarios, recent results suggest that, when evaluated on tasks from a different data distribution than the one used for training, a baseline that simply finetunes a pre-trained network may be more effective than more complicated techniques such as MAML, one of the most popular meta-learning methods. This is surprising, as the learning behaviour of MAML mimics that of finetuning: both rely on re-using learned features. We investigate the observed performance differences between finetuning, MAML, and another meta-learning technique called Reptile, and show that MAML and Reptile specialize for fast adaptation in low-data regimes with a data distribution similar to the one used for training. Our findings show that both the output layer and the noisy training conditions induced by data scarcity play important roles in facilitating this specialization for MAML. Lastly, we show that the pre-trained features obtained by the finetuning baseline are more diverse and discriminative than those learned by MAML and Reptile. Due to this lack of diversity and the distribution specialization, MAML and Reptile may fail to generalize to out-of-distribution tasks, whereas finetuning can fall back on the diversity of the learned features. Comment: Accepted at Machine Learning Journal, Special Issue on Discovery Science 202
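
    As a minimal, hedged illustration of the finetuning baseline discussed above, the sketch below freezes a "pre-trained" feature extractor and fits only a fresh output layer on a new task's support set. The random extractor, the sizes, and the least-squares head are assumptions made for brevity, not the paper's actual setup.

# Hedged sketch of a finetuning-style baseline: frozen features, new linear head.
import numpy as np

rng = np.random.default_rng(1)

def features(x, W_feat):
    # Stand-in for a frozen, pre-trained feature extractor.
    return np.maximum(x @ W_feat, 0.0)

d_in, d_feat, n_classes, n_support = 16, 32, 5, 25
W_feat = rng.normal(size=(d_in, d_feat))  # placeholder "pre-trained" weights

X_support = rng.normal(size=(n_support, d_in))
y_support = rng.integers(n_classes, size=n_support)
Y_onehot = np.eye(n_classes)[y_support]

# Fit only the output layer, here via ridge-regularized least squares.
Phi = features(X_support, W_feat)
W_head = np.linalg.solve(Phi.T @ Phi + 1e-2 * np.eye(d_feat), Phi.T @ Y_onehot)

X_query = rng.normal(size=(10, d_in))
pred = features(X_query, W_feat) @ W_head
print("predicted classes for the query set:", pred.argmax(axis=1))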

    A Survey of Deep Meta-Learning

    Deep neural networks can achieve great success when presented with large data sets and sufficient computational resources. However, their ability to learn new concepts quickly is quite limited. Meta-learning is one approach to address this issue, by enabling the network to learn how to learn. The exciting field of Deep Meta-Learning advances at great speed, but lacks a unified, insightful overview of current techniques. This work presents just that. After providing the reader with a theoretical foundation, we investigate and summarize key methods, which are categorized into i) metric-, ii) model-, and iii) optimization-based techniques. In addition, we identify the main open challenges, such as performance evaluation on heterogeneous benchmarks and the reduction of the computational costs of meta-learning. Comment: Extended version of book chapter in 'Metalearning: Applications to Automated Machine Learning and Data Mining' (2nd edition, forthcoming).
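
    As a small, hedged illustration of the metric-based category mentioned above, the sketch below labels query examples by the nearest class prototype (the mean support embedding), in the spirit of Prototypical Networks; the random embeddings and episode sizes are placeholders, not content from the survey.

# Hedged sketch of metric-based few-shot classification by nearest prototype.
import numpy as np

rng = np.random.default_rng(2)

def prototype_classify(support_emb, support_labels, query_emb):
    # Assign each query to the class whose mean support embedding is closest.
    classes = np.unique(support_labels)
    protos = np.stack([support_emb[support_labels == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(query_emb[:, None, :] - protos[None, :, :], axis=-1)
    return classes[dists.argmin(axis=1)]

# Toy 5-way 1-shot episode with 8-dimensional embeddings (hypothetical numbers).
support_emb = rng.normal(size=(5, 8))
support_labels = np.arange(5)
query_emb = support_emb + 0.05 * rng.normal(size=(5, 8))
print(prototype_classify(support_emb, support_labels, query_emb))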

    Are LSTMs Good Few-Shot Learners?

    Deep learning requires large amounts of data to learn new tasks well, limiting its applicability to domains where such data is available. Meta-learning overcomes this limitation by learning how to learn. In 2001, Hochreiter et al. showed that an LSTM trained with backpropagation across different tasks is capable of meta-learning. Despite promising results of this approach on small problems, and more recently also on reinforcement learning problems, the approach has received little attention in the supervised few-shot learning setting. We revisit this approach and test it on modern few-shot learning benchmarks. We find that LSTMs, surprisingly, outperform the popular meta-learning technique MAML on a simple few-shot sine wave regression benchmark, but that LSTMs, as expected, fall short on more complex few-shot image classification benchmarks. We identify two potential causes and propose a new method called Outer Product LSTM (OP-LSTM) that resolves these issues and displays substantial performance gains over the plain LSTM. Compared to popular meta-learning baselines, OP-LSTM yields competitive performance on within-domain few-shot image classification and performs better in cross-domain settings, by 0.5% to 1.9% in accuracy. While these results alone do not set a new state of the art, the advances of OP-LSTM are orthogonal to other advances in the field of meta-learning, yield new insights into how LSTMs work in image classification, and allow for a whole range of new research directions. For reproducibility purposes, we publish all our research code publicly. Comment: Accepted at Machine Learning Journal, Special Issue of the ECML PKDD 2023 Journal Track.
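
    The Hochreiter-style approach revisited above trains an LSTM across tasks so that adaptation happens in the hidden state rather than through gradient steps at test time. The sketch below illustrates that idea on a toy sine-wave task by feeding the current input together with the previous target; the architecture, sizes, and task are assumptions for illustration, and this is not the OP-LSTM method itself.

# Hedged sketch of an LSTM meta-learner fed (x_t, y_{t-1}) pairs, so that the
# hidden state can carry task-specific information within a task.
import torch
import torch.nn as nn

class LSTMMetaLearner(nn.Module):
    def __init__(self, hidden=40):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, y_prev):
        # Concatenate the current input with the previous target along features.
        h, _ = self.lstm(torch.cat([x, y_prev], dim=-1))
        return self.head(h)

# One toy sine-wave task: predict y_t from x_t given the earlier (x, y) pairs.
T = 20
x = torch.linspace(-5.0, 5.0, T).view(1, T, 1)
y = torch.sin(x)
y_prev = torch.cat([torch.zeros(1, 1, 1), y[:, :-1]], dim=1)

model = LSTMMetaLearner()
loss = nn.functional.mse_loss(model(x, y_prev), y)
loss.backward()  # in training this loss would be backpropagated across many tasks
print("task loss:", float(loss))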

    Preparing children with a mock scanner training protocol results in high quality structural and functional MRI scans

    We evaluated the use of a mock scanner training protocol as an alternative to sedation and as a means of preparing young children for (functional) magnetic resonance imaging (MRI). Children with severe mental retardation or developmental disorders were excluded. A group of 90 children (median age 6.5 years, range 3.65–14.5 years) participated in this study. Children were referred to the actual MRI investigation only when they passed the training. We assessed the pass rate of the mock scanner training sessions. In addition, the quality of both structural and functional MRI (fMRI) scans was rated on a semi-quantitative scale. The overall pass rate of the mock scanner training sessions was 85/90. Structural scans of diagnostic quality were obtained in 81/90 children, and fMRI scans with sufficient quality for further analysis were obtained in 30/43 of the children. Even in children under 7 years of age, who are generally sedated, the success rate of structural scans with diagnostic quality was 53/60. fMRI scans with sufficient quality were obtained in 23/36 of the children in this younger age group. The association between age and the proportion of children with fMRI scans of sufficient quality was not statistically significant. We conclude that a mock MRI scanner training protocol can be useful to prepare children for a diagnostic MRI scan. It may reduce the need for sedation in young children undergoing MRI. Our protocol is also effective in preparing young children to participate in fMRI investigations.
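
    The age comparison reported above can be illustrated with the published counts: 23/36 usable fMRI scans under age 7 and 30/43 overall, which implies 7/7 in the older group. The sketch below runs a standard 2x2 exact test on that reconstructed table; it is only an illustration of this kind of comparison, not the statistical procedure the study actually used.

# Hedged sketch: exact test of the usable-fMRI proportion by age group,
# using a 2x2 table reconstructed from the counts reported in the abstract.
from scipy.stats import fisher_exact

younger = [23, 36 - 23]                         # usable, not usable (< 7 years)
older = [30 - 23, (43 - 36) - (30 - 23)]        # usable, not usable (>= 7 years)

odds_ratio, p_value = fisher_exact([younger, older])
print(f"sample odds ratio = {odds_ratio:.2f}, two-sided p = {p_value:.3f}")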

    Extended Treatment with Apixaban for Venous Thromboembolism Prevention in the Netherlands: Clinical and Economic Effects

    Background: Dutch guidelines advise extended anticoagulant treatment with direct oral anticoagulants or vitamin K antagonists for patients with idiopathic venous thromboembolism (VTE) who do not have a high bleeding risk. Objectives: The aim of this study was to analyze the economic effects of extended treatment with apixaban in the Netherlands, based on an updated and adapted previously published model. Methods: We performed a cost-effectiveness analysis simulating a population of 1,000 VTE patients. The base-case analysis compared extended apixaban treatment to no treatment after the first 6 months. Five additional scenarios were analyzed to evaluate the effect of different bleeding risks and the health care payer's perspective. The primary outcome of the model is the incremental cost-effectiveness ratio (ICER) in costs (€) per quality-adjusted life-year (QALY), with one QALY defined as 1 year in perfect health. To account for the influence of uncertainties in the model, probabilistic and univariate sensitivity analyses were conducted. The treatment was considered cost-effective at an ICER below €20,000/QALY, the most commonly used willingness-to-pay (WTP) threshold for preventive drugs in the Netherlands. Results: The model showed a reduction in recurrent VTE and no increase in major bleeding events for extended treatment in all scenarios. The base-case analysis showed an ICER of €9,653/QALY. The probability of apixaban being cost-effective in the base case was 70.0% and 91.4% at WTP thresholds of €20,000/QALY and €50,000/QALY, respectively. Conclusion: Extended treatment with apixaban is cost-effective for the prevention of recurrent VTE in Dutch patients.
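
    The ICER reported above is the incremental cost divided by the incremental QALYs of extended treatment versus its comparator, judged against the €20,000/QALY willingness-to-pay threshold. The figures in the sketch below are hypothetical placeholders chosen only to show the arithmetic, not outputs of the study's model.

# Hedged worked example of an ICER calculation against a WTP threshold.
def icer(cost_treat, cost_comp, qaly_treat, qaly_comp):
    # Incremental cost-effectiveness ratio in euros per QALY gained.
    return (cost_treat - cost_comp) / (qaly_treat - qaly_comp)

wtp_threshold = 20_000  # euros per QALY, the Dutch threshold cited in the abstract

# Hypothetical per-patient figures for extended treatment vs. no extended treatment.
ratio = icer(cost_treat=1_500.0, cost_comp=300.0, qaly_treat=8.48, qaly_comp=8.40)
print(f"ICER = €{ratio:,.0f}/QALY, cost-effective: {ratio < wtp_threshold}")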

    A meta-analysis of genome-wide association studies identifies multiple longevity genes

    Human longevity is heritable, but genome-wide association (GWA) studies have had limited success. Here, we perform two meta-analyses of GWA studies of a rigorous longevity phenotype definition including 11,262/3,484 cases surviving at or beyond the age corresponding to the 90th/99th survival percentile, respectively, and 25,483 controls whose age at death or at last contact was at or below the age corresponding to the 60th survival percentile. Consistent with previous reports, rs429358 (apolipoprotein E (ApoE) ε4) is associated with lower odds of surviving to the 90th and 99th percentile age, while rs7412 (ApoE ε2) shows the opposite. Moreover, rs7676745, located near GPR78, associates with lower odds of surviving to the 90th percentile age. Gene-level association analysis reveals a role for tissue-specific expression of multiple genes in longevity. Finally, genetic correlation of the longevity GWA results with those of several disease-related phenotypes points to a shared genetic architecture between health and longevity.
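
    For readers unfamiliar with how individual variants are tested in such GWA studies, the sketch below runs a single-variant case/control association via logistic regression on allele dosage and reports the per-allele odds ratio. The data are simulated, and the real pipeline is considerably more involved (covariates, imputation quality, and study-level meta-analysis).

# Hedged sketch of a single-variant case/control association test.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 2_000
dosage = rng.binomial(2, 0.15, size=n)   # 0/1/2 copies of the tested allele
logit = -0.5 - 0.3 * dosage              # simulate an allele that lowers the odds of being a case
prob_case = 1.0 / (1.0 + np.exp(-logit))
case = rng.binomial(1, prob_case)        # 1 = long-lived case, 0 = control

X = sm.add_constant(dosage.astype(float))
fit = sm.Logit(case, X).fit(disp=0)
print("per-allele odds ratio:", float(np.exp(fit.params[1])))
print("p-value:", float(fit.pvalues[1]))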

    Foreword

    This volume contains the proceedings of the Eighth Workshop on Specification and Verification of Component-Based Systems (SAVCBS 2009), affiliated with the 7th joint meeting of the European Software Engineering Conference (ESEC) and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE). SAVCBS 2009 took place in Amsterdam, the Netherlands, on August 25, 2009. SAVCBS is focused on using formal (i.e., mathematical) techniques to establish a foundation for the specification and verification of component-based systems. Specification techniques are urgently needed to support effective reasoning about systems composed from components. Component-based approaches also underscore the need for scaling advanced verification techniques such as extended static analysis and model checking to the size of real systems. The workshop considers formalization of both functional and non-functional behavior (such as performance or reliability). SAVCBS aims to bring together researchers and practitioners in the areas of component-based software and formal methods to address the open problems in modular specification and verification of systems composed from components. The workshop seeks to bridge the gap between principles and practice in this research area. The intent of bringing participants together at the workshop is to help form a community-oriented understanding of the relevant research problems and to help steer formal methods research in a direction that will address the problems of component-based systems. The goals of the workshop are to produce: 1. Contacts and discussion among researchers and practitioners, and 2. A web site that will be maintained after the workshop to act as a central clearinghouse for research in this area. We enthusiastically thank the authors of submitted papers; their quality contributions and participation are what make a workshop like SAVCBS successful. We thank the program committee for their careful reading and reviewing of the submissions. Our PC members have expertise in a wide variety of subdisciplines related to specification and verification of component-based systems; they include established research leaders and promising recent Ph.D.s; they come from academia and esteemed research institutes, and they hail from all over the world. We received 5 research paper submissions. All papers were reviewed by 3 PC members. After PC discussions, 4 papers were accepted. This year's program also includes solutions to a specification and verification challenge problem for database libraries. The specification challenge was to specify a core subset of a database library such as the JDBC (java.sql) library in Java or the ODBC library (System.Data and/or System.Data.SqlClient) in Microsoft ADO.NET, and in particular to address coordination among related objects, the use of complex, multi-dimensional, multi-modal interfaces, or asynchrony. The verification challenge was to verify an adapter from Apache Beehive that implements an Iterator interface in terms of the JDBC ResultSet. Solutions to the challenge problem are available online via http://www.eecs.ucf.edu/SAVCBS/2009. This year, we are pleased to have a keynote talk by Mariëlle Stoelinga, University of Twente, the Netherlands, and an invited talk by Natasha Sharygina, University of Lugano, Switzerland.